About the ARC Challenge

The Abstraction and Reasoning Corpus (ARC) is a benchmark designed to measure general fluid intelligence in AI systems. Each task requires the AI to infer a pattern from a few examples and apply it to a new input.

Each task contains:

  • Training examples showing input-output pairs that demonstrate the pattern
  • A test input where the AI must predict the correct output
  • The ground truth test output for evaluation
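
As a rough sketch of this structure, assuming the public ARC JSON layout (a "train" list and a "test" list of input/output pairs, each grid a list of rows of integers 0 to 9), a task and an exact-match check might look like the following. The toy task and the names toy_task, is_correct, and mirror_rows are made up for illustration and are not taken from the BARC code.

```python
# Hypothetical toy task in the public ARC JSON layout.
# Pattern to infer: mirror each row left to right.
toy_task = {
    "train": [
        {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
        {"input": [[3, 3, 0]], "output": [[0, 3, 3]]},
    ],
    "test": [
        {"input": [[0, 5], [4, 0]], "output": [[5, 0], [0, 4]]},
    ],
}


def is_correct(predicted, expected):
    """Exact match: the grid shape and every cell must agree."""
    return predicted == expected


def mirror_rows(grid):
    """Illustrative solver for the toy pattern only (an assumption)."""
    return [list(reversed(row)) for row in grid]


# Check that the solver reproduces every training output.
fits_training = all(
    is_correct(mirror_rows(pair["input"]), pair["output"])
    for pair in toy_task["train"]
)

# Predict on the test input and compare with the ground truth.
test_pair = toy_task["test"][0]
prediction = mirror_rows(test_pair["input"])
solved = is_correct(prediction, test_pair["output"])
```

Because the comparison is on whole grids, a prediction with the wrong shape or a single wrong cell counts as a failure.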

This page showcases different transduction models attempting to solve tasks from the ARC validation set. For each task, the models generate multiple candidate solutions, which are then ranked using various strategies, including test-time fine-tuning and reranking.
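
To make the idea of ranking candidates concrete, here is a minimal sketch that orders sampled grids by how often they occur and then checks whether the ground truth appears among the top k. Frequency voting is only a simple stand-in chosen for illustration; it is not necessarily the ranking used by the models shown here, and the function names, the sample data, and the default k=2 are assumptions.

```python
from collections import Counter


def rank_candidates(candidates):
    """Rank distinct candidate grids by how often they were sampled.

    Assumption: sample frequency stands in for a model score.
    """
    # Grids are nested lists, so convert them to tuples to make them hashable.
    counts = Counter(tuple(map(tuple, grid)) for grid in candidates)
    # most_common sorts by count, descending; ties keep first-seen order.
    return [[list(row) for row in key] for key, _ in counts.most_common()]


def solved_within_top_k(ranked, expected, k=2):
    """True if the ground truth is among the first k ranked candidates."""
    return expected in ranked[:k]


# Hypothetical samples for a single test input.
samples = [
    [[5, 0], [0, 4]],
    [[0, 5], [4, 0]],
    [[5, 0], [0, 4]],
    [[5, 5], [0, 4]],
]
ranked = rank_candidates(samples)
truth = [[5, 0], [0, 4]]

# 1-based rank of the correct grid, or None if no candidate matches.
rank_of_truth = ranked.index(truth) + 1 if truth in ranked else None
```

Averaging solved_within_top_k over all tasks would give a success rate for a chosen k.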

The visualization allows you to:

  • Compare different model variants and their performance
  • View training examples and test cases
  • Examine candidate solutions generated by the models
  • Track success rates and solution rankings

For implementation details about the models and evaluation process, visit: github.com/xu3kev/BARC/blob/master/seeds/common.py
